Theys et al. Retrovirology 2012, 9:81
http://www.retrovirology.com/content/9/81




RETROVIROLOGY 




Treatment-associated polymorphisms in 
protease are significantly associated with 
higher viral load and lower CD4 count in newly 
diagnosed drug-naive HIV-1 infected patients 



Kristof Theys 1 *, Koen Deforche 2 , Jurgen Vercauteren 1 , Pieter Libin 2 , David AMC van de Vijver 3 , 

Jan Albert 4 , Birgitta Asjo 5 , Claudia Balotta 6 , Marie Bruckova 7 , Ricardo J Camacho 8,9 , 

Bonaventura Clotet 10 , Suzie Coughlan 11 , Zehava Grossman 12 , Osamah Hamouda 13 , 

Andrzej Horban 14 , Klaus Korn 15 , Leondios G Kostrikis 15 , Claudia Kucherer 13 , Claus Nielsen 17 , 

Dimitrios Paraskevis 18 , Mario Poljak 19 , Elisabeth Puchhammer-Stockl 20 , Chiara Riva 6 , 

Lidia Ruiz 10 , Kirsi Liitsola 21 , Jean-Claude Schmit 22 , Rob Schuurman 23 , Anders Sonnerborg 24 , 

Danica Stanekova 25 , Maja Stanojevic 26 , Daniel Struck 22 , Kristel Van Laethem 1 , 

Annemarie MJ Wensing 23 , Charles AB Boucher 3,23 , Anne-Mieke Vandamme 1,8 and on behalf of the 

SPREAD-programme 



Background: The effect of drug resistance transmission on disease progression in the newly infected patient is not 
well understood. Major drug resistance mutations severely impair viral fitness in a drug free environment, and 
therefore are expected to revert quickly. Compensatory mutations, often already polymorphic in wild-type viruses, do 
not tend to revert after transmission. While compensatory mutations increase fitness during treatment, their presence 
may also modulate viral fitness and virulence in absence of therapy and major resistance mutations. We previously 
designed a modeling technique that quantifies genotypic footprints of in vivo treatment selective pressure, including 
both drug resistance mutations and polymorphic compensatory mutations, through the quantitative description of a 
fitness landscape from virus genetic sequences. 

Results: Genotypic correlates of viral load and CD4 cell count were evaluated in subtype B sequences from recently 
diagnosed treatment-naive patients enrolled in the SPREAD programme. The association of surveillance drug 
resistance mutations, reported compensatory mutations and fitness estimated from drug selective pressure fitness 
landscapes with baseline viral load and CD4 cell count was evaluated using regression techniques. Protease 
genotypic variability estimated to increase fitness during treatment was associated with higher viral load and lower 
CD4 cell counts also in treatment-naive patients, which could primarily be attributed to well-known compensatory 
mutations at highly polymorphic positions. By contrast, treatment-related mutations in reverse transcriptase could not 
explain viral load or CD4 cell count variability. 

Conclusions: These results suggest that polymorphic compensatory mutations in protease, reported to be selected 
during treatment, may improve the replicative capacity of HIV-1 even in absence of drug selective pressure or major 
resistance mutations. The presence of this polymorphic variation may either reflect a history of drug selective 
pressure, i.e. transmission from a treated patient, or merely be a result of diversity in wild-type virus. Our findings 
suggest that transmitted drug resistance has the potential to contribute to faster disease progression in the newly 
infected host and to shape the HIV-1 epidemic at a population level. 



Background 

Following initial HIV-1 infection, the rate of clinical dis- 
ease progression reflects the complex interplay of host- 
and virus-related as well as socio-economic factors. This 
highly variable rate can be assessed and predicted by 
monitoring the evolution of prognostic markers such as 
the number of viral particles in the plasma (viral load 
or viremia) and CD4+ T-lymphocytes cell count (CD4 
count). Constituting the only current strategy to delay 
disease progression, the primary goal of antiretroviral 
therapy (ART) is to maximally inhibit viral replication and 
to aim for immunological reconstitution. However, accu- 
mulation of drug resistance mutations during suboptimal 
therapy severely affects the clinical benefit of ART, leading 
to therapy failure [1]. 

HIV-1 evolutionary dynamics under selective pressure 
of ART are largely governed by competitive fitness, to 
which viral replication, phenotypic drug resistance and 
intrinsic replicative capacity (RC) contribute. While an 
increased ability of the virus to replicate in the presence of 
drug results from decreased phenotypic drug susceptibil- 
ity, major drug resistance mutations reduce the inherent 
ability of HIV-1 to replicate in absence of drug (replica- 
tive capacity). Hence, virus evolution is characterized by 
repair strategies that include compensatory mutations in 
the targeted gene [2]. Despite these compensatory effects, 
drug-resistant viruses tend to replicate less efficiently than 
wild-type viruses in absence of treatment, which is exem- 
plified by the fact that archived wild-type viruses become 
again predominant during treatment interruption [3]. 

The transmission of drug resistance (TDR) among 
adults recently infected in North-America and Europe is a 
consequence of the widespread use of antiviral agents and 
related resistance accumulation in the ART-experienced 
population. A large survey of 17 European countries 
reported a TDR prevalence of 9.0% among newly diag- 
nosed persons [4-6]. Transmitted resistant virus was 
initially believed to become irrelevant over time, as it 
would gradually disappear from the dominant quasis- 
pecies: mutations reverting to wild-type or alternative 
amino acids reflect the impaired fitness of TDR variants 
in absence of drug pressure and wild-type virus. However, 
persistence of TDR variants as dominant quasispecies, 
within a new host and within transmission chains, has 
been reported. Prevention of wild-type state reversion 



may result from absence of competition in the founder 
virus population [7], the existence of steep fitness val- 
leys between resistance mutant and original wild-type [8], 
restoration of fitness through selection of compensatory 
mutations [9] or a combination of these factors. However, 
the relationship between fitness cost and persistence may 
be complex due to mutational interactions [10], and early 
immune responses may as well influence the reversion of 
TDR. 

CD4 cell loss is a prognostic marker for development of 
clinical symptoms and progression towards AIDS. HIV- 
1 isolates vary widely in features that determine viral 
fitness and virulence. For HIV-1, it might seem rea- 
sonable to infer that increased viral fitness coincides 
with elevated pathogenicity, since an inverse relation- 
ship between the viral load and the rate of CD4 decline 
is often observed. Under such an assumption, presence of 
transmitted drug resistance mutations that impair repli- 
cation capacity could result in lower viral loads, thereby 
sustaining CD4 cell counts and delaying disease progres- 
sion [11]. However, due to their high fitness costs in a 
drug free environment, major resistance mutations tend 
to revert after transmission. Although currently unknown, 
mutations selected during treatment for their compen- 
satory fitness effects could be more persistent, since these 
mutations can also occur as natural polymorphisms [12]. 
Consequently, these compensatory mutations may modu- 
late replication capacity of the virus not only in presence 
but also in absence of detrimental, major resistance muta- 
tions [13,14], and if so, it remains unclear whether their 
presence will result in higher viral fitness and virulence. 

It has been observed that the viral genotype has a strong 
and direct effect on HIV-1 viral load [15]: the viral load 
of the transmitting partner is strongly predictive of the 
viral load in the recipient partner [16] and polymorphic 
variation associated with replicative capacity has been 
reported [17]. To what extent genetic variation resulting 
from transmitted drug resistance is influencing viral load 
or CD4 count in the recipient is not fully understood. 
Although transmission of drug resistance has been widely 
documented, whether drug selective pressure could shape 
the epidemic at a population level or alter the natural 
history of infection has been poorly investigated. In this 
study, we aimed to elucidate the potential of transmitted 
drug resistance to influence the natural history of HIV-1 






infection at a population level. We investigated whether 
polymorphic compensatory mutations, reported to be 
enriched during therapy and therefore contributing to 
fitness under drug selective pressure, were also associ- 
ated with increased fitness and virulence in patients never 
exposed to ART. For this purpose, we used a model that 
correlated fitness of genetic variants with drug selective 
pressure, and verified whether increased fitness under 
such a model also correlated with increased fitness in 
newly diagnosed treatment naive individuals, as measured 
by viral load and CD4 cell count. The effect of known 
TDR and compensatory mutations was also investigated 
in support of our hypothesis. 

Results 

Descriptive characteristics of study population 

The analysis was restricted to HIV-1 subtype B sequences 
in order to minimize inter-subtype variability in the num- 
ber and prevalence of polymorphic mutations and to 
exclude possible confounding effects on disease progres- 
sion [18]. The analyzed SPREAD dataset contained 1782 
newly diagnosed individuals that were infected with a 
HIV-1 subtype B virus (65%). A measurement of viral 
load and CD4 count was available for 1599 patients. 
Baseline characteristics of these patients are described 
in Table 1. Genotypic evidence of TDR was detected in 
9.4% (95%-CI: 8.1-11.0) of patients, including 51 patients 
with PI (3.2%, 95%-CI: 2.4-4.2), 84 patients with NRTI 
(5.3%, 95%-CI: 4.2-6.4) and 41 patients with NNRTI (2.6%, 



95%-CI: 1.9-3.5) resistance. Studies have suggested that 
patients with evidence of TDR may harbour virus with 
impaired replication capacity potentially leading to a less 
pathogenic virus. However, patients with or without indi- 
cations of TDR had overall similar characteristics, and no 
significant difference in viral load (p-value = 0.52) and 
CD4 cell count (p-value = 0.14) could be detected. 
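
The prevalence estimate above can be checked with a short calculation. The sketch below assumes a Wilson score interval, which is our choice for illustration; the text does not state which interval method was used. The counts (151 TDR patients out of 1599) are taken from Table 1.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p_hat = successes / total
    denom = 1 + z**2 / total
    centre = (p_hat + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / total + z**2 / (4 * total**2)
    ) / denom
    return centre - half_width, centre + half_width

# Counts from Table 1: 151 patients with TDR among 1599 subtype B patients.
low, high = wilson_interval(151, 1599)
print(f"TDR prevalence: {151 / 1599:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Under this assumption the interval comes out close to the reported 8.1-11.0%; other interval methods would give slightly different bounds.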

Estimating the fitness of the subtype B sequences under 
drug selective pressure 

Since variation in viral load and CD4 cell count could 
not be explained by the presence or absence of TDR, 
we assessed to what extent variability in these baseline 
parameters could be explained by mutations and poly- 
morphisms that contribute to increased in vivo fitness 
under drug selective pressure. To this aim, two fitness 
models (F_PI and F_RTI) assigned fitness values to genetic 
variants. These models were estimated by relating the 
increase in prevalence of mutations with selective advan- 
tage during treatment (Additional file 1, [19]). Given 
that epistatic interactions alter the fitness impact of a 
mutation depending on the context of other mutations, 
we considered not only individual mutations, but also 
the increase in prevalence of mutation patterns. Conse- 
quently, the relative contribution of a mutation to the 
estimated fitness varied depending on the presence of 
other mutations with which it interacts. The highly com- 
plex fitness function returned for each genotype a single 
fitness value based on the different mutations present in 



Table 1 Characteristics of HIV-1 subtype B patients included in the analyses for prediction of viral load and CD4 cell count

Characteristics                   Subtype B patients (N = 1599)   TDR (N = 151)      Wild-type (N = 1446)
log10 HIV-RNA copies/ml           4.85 (4.32-5.35)                4.83 (4.30-5.33)   4.86 (4.32-5.35)
CD4 cells/mm3                     382 (212-583)                   410 (249-582)      377 (209-583)
Age, years                        35 (29-42)                      34 (28-40)         35 (30-42)
Male sex, n (%)                   1413 (89%)                      137 (91%)          1276 (89%)
Duration of infection, n (%)
  < 1 year                        541 (34%)                       57 (37%)           484 (33%)
  Undefined                       1056 (66%)                      94 (63%)           962 (67%)
Source of HIV-1 exposure, n (%)
  Homo/bisexual contact           1016 (65%)                      104 (69%)          912 (63%)
  Heterosexual contact            316 (20%)                       26 (17%)           290 (20%)
  Intravenous drug use            123 (8%)                        8 (5%)             115 (8%)
  Other                           142 (9%)                        13 (9%)            129 (9%)
Area of origin, n (%)
  Western Europe                  1147 (72%)                      105 (70%)          1042 (72%)
  Eastern Europe & Central Asia   282 (18%)                       27 (19%)           255 (18%)
  Other                           168 (10%)                       19 (11%)           149 (10%)

Patient characteristics are shown for the subtype B study dataset (N = 1599), including patients with genotypic evidence of transmitted drug resistance (TDR) and
patients without genotypic evidence of transmitted drug resistance (wild-type). Data are expressed as median values accompanied by interquartile ranges, or as
number of patients accompanied by proportion of the subtype B dataset (%).

the genotype. Using the fitness models (F_PI and F_RTI), 
we computed for each sequence the estimated in vivo 
fitness of respectively subtype B protease and reverse tran- 
scriptase. Mutations used in these models outnumbered 
the TDR mutation list [20], and included in addition to 
major resistance mutations also polymorphic, compen- 
satory mutations (see Additional file 2 for a complete 
list of the fitness mutations and their prevalence in the 
study population). Both types of mutations were assigned 
higher weights in our fitness landscapes. Since it has 
been reported that major resistance mutations decrease 
replicative capacity in absence of selective pressure [1], we 
hypothesized that treatment-related polymorphisms (so- 
called compensatory mutations) may increase fitness also 
in absence of drug selective pressure. In one approach to 
evaluate the contribution of these compensatory muta- 
tions, a second fitness estimate was calculated for those 
sequences with major drug resistance mutations (11.4%) 
by reverting the major resistance mutation to the wild-type 
amino acid (F_PI-m and F_RTI-m), thereby excluding 
the fitness contributions of major resistance mutations. 
The different fitness estimates of the sequences are sum- 
marized in Table 2, with the range of estimated fitness 
being more dispersed for PR than for RT. Computed fit- 
ness values were significantly higher in patients with TDR 
compared to patients without TDR, except for the esti- 
mated fitness of PR when the effect of major resistance 
mutations was excluded (log10 F_PI-m, p-value = 0.427). 
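
To make the two fitness estimates concrete, the following toy sketch shows how a genotype could be scored with and without its major resistance mutations. It is not the authors' fitness landscape: the additive-plus-pairwise form, the weights, the interaction term and the set of major mutations are all hypothetical and chosen only for illustration.

```python
# All weights below are hypothetical and NOT taken from the paper.
MAIN_EFFECTS = {"L90M": 0.30, "L10I": 0.10, "A71V": 0.12, "L63P": 0.05}
# Toy epistatic term: the pair contributes more than the sum of its parts.
PAIR_EFFECTS = {frozenset({"L90M", "A71V"}): 0.08}
# Assumed set of major resistance mutations for this example.
MAJOR_MUTATIONS = {"L90M"}

def log10_fitness(mutations):
    """Sum main effects plus any pairwise terms whose members are all present."""
    muts = set(mutations)
    total = sum(MAIN_EFFECTS.get(m, 0.0) for m in muts)
    total += sum(w for pair, w in PAIR_EFFECTS.items() if pair <= muts)
    return total

def revert_major(mutations):
    """Return the genotype with major resistance mutations set back to wild-type."""
    return set(mutations) - MAJOR_MUTATIONS

genotype = {"L90M", "A71V", "L63P"}
full = log10_fitness(genotype)                    # analogue of log10 F_PI
reverted = log10_fitness(revert_major(genotype))  # analogue of log10 F_PI-m
print(f"full: {full:.2f}, major mutations reverted: {reverted:.2f}")
```

In this toy setting, reverting the major mutation removes both its own weight and the interaction term, so the remaining score reflects only the polymorphic, compensatory positions.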

Evaluating genotypic predictors of viral load and CD4 cell 
count 

The following genotypic predictors of viral load and CD4 
were investigated: the presence of TDR, estimated fitness 
under a drug selective pressure model (log10 F_PI, log10 
F_RTI) and the number of known compensatory mutations [21]. 
However, previous studies showed that levels 
of viral replication (i.e. viral load) and immune depletion 
(i.e. CD4 cell count decline) result from a complex 
interplay between host and virus characteristics 
including time since infection, trends in TDR over calen- 
dar year, socio-economic factors, access to medical care 
and more. In our study, indications of acute infection at 
the time of diagnosis were significantly associated with 



higher viral loads (4.95 log10 copies/ml [IQR: 4.4-5.4 
log10 copies/ml] versus 4.82 log10 copies/ml [IQR: 4.3-5.3 
log10 copies/ml], p-value = 0.015) and higher CD4 counts 
(509 cells/mm3 [IQR: 360-661 cells/mm3] versus 314 cells/mm3 
[IQR: 134-503 cells/mm3], p-value < 0.001). The presence 
of TDR was however not significantly higher in individ- 
uals recently infected than in individuals with unknown 
duration of infection, neither overall TDR (10.7% vs 8.9% 
respectively, p-value = 0.25), nor PI (4.2% versus 2.6%, 
p-value = 0.11) or RTI (7.0% vs 6.9%, p-value = 0.96) sepa- 
rately. The log fitness estimates and their range were also 
not significantly different (data not shown). 

To account for duration of infection and other poten- 
tial confounders, square-root transformed CD4 counts 
and log10 transformed viral load were modeled as a linear 
function of TDR or estimated in vivo fitness. Since 
especially recent infection was anticipated to bias the 
results, the simplest model (Model 1) included predictors 
derived from the genotype (TDR_PI, TDR_RTI, log10 F_PI, 
log10 F_RTI) and the duration of infection (recently infected 
vs unknown duration of infection). In the fully adjusted 
analyses (Model 2), risk group, age, country of origin and 
gender were added as explanatory variables (Table 3). In 
both models, there was no significant association between 
infection with TDR and viral load or CD4 count. However, 
higher estimated fitness under the PI but not under 
the RTI drug selective pressure model (log10 F_PI and 
log10 F_RTI, respectively) was significantly correlated with higher 
viral load and lower CD4 count. To verify whether a sin- 
gle country introduced a bias, regression analyses were 
repeated by iteratively excluding data from one country, 
which did not change our results (data not shown). These 
observed correlations indicate that in treatment-naive 
patients, mutations in PR that give a selective advantage 
under drug selective pressure and are not included in the 
TDR list, are associated with better replicating virus in 
vivo. 
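
The structure of Model 1 can be sketched as an ordinary least squares fit on transformed outcomes. The example below uses entirely synthetic data and plain NumPy; the sample size, planted effect sizes and noise levels are invented, the TDR indicator variables are omitted for brevity, and the fitting routine is our assumption rather than the software used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic predictors (all values invented for illustration).
log10_f_pi = rng.normal(0.4, 0.2, n)    # stand-in for log10 F_PI
log10_f_rti = rng.normal(0.04, 0.1, n)  # stand-in for log10 F_RTI
recent = rng.integers(0, 2, n)          # 1 = recently infected, 0 = unknown duration

# Synthetic raw outcomes; an effect is planted for F_PI only.
viral_load = 10 ** (4.7 + 0.27 * log10_f_pi + 0.1 * recent + rng.normal(0, 0.7, n))
cd4_count = (20 - 1.6 * log10_f_pi + 3.0 * recent + rng.normal(0, 5, n)) ** 2

def fit_ols(y, *predictors):
    """Ordinary least squares; returns [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Transform the outcomes as described in the text, then fit the model.
coef_vl = fit_ols(np.log10(viral_load), log10_f_pi, log10_f_rti, recent)
coef_cd4 = fit_ols(np.sqrt(cd4_count), log10_f_pi, log10_f_rti, recent)
print("log10 VL coefficients:", np.round(coef_vl, 3))
print("sqrt CD4 coefficients:", np.round(coef_cd4, 3))
```

Model 2 would extend the design matrix with further columns (age, sex, risk group, area of origin), and the country-exclusion check corresponds to refitting after dropping the rows from one country at a time.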

The analysis was subsequently extended by two addi- 
tional approaches to further investigate the role of genetic 
determinants and to exclude a trend of TDR over 
calendar year as a confounding factor. In a first approach, 
we reverted major resistance mutations present in the 
viral sequences (log10 F_PI-m, log10 F_RTI-m) (Table 3). 



Table 2 Estimated fitness according to the fitness landscape models

Factors         Subtype B patients (N = 1599)   TDR (N = 151)         Wild-type (N = 1446)    p-value
log10 F_PI      0.38 (0.24 to 0.56)             0.44 (0.26 to 0.66)   0.38 (0.24 to 0.56)     0.001
log10 F_PI-m    0.38 (0.24 to 0.55)             0.41 (0.25 to 0.57)   0.38 (0.24 to 0.56)     0.427
log10 F_RTI     0.04 (-0.03 to 0.11)            0.14 (0.03 to 0.27)   0.03 (-0.03 to 0.10)    < 0.001
log10 F_RTI-m   0.04 (-0.03 to 0.11)            0.08 (0.01 to 0.17)   0.03 (-0.03 to 0.11)    < 0.001

Log10 F_PI is the estimated PR fitness and log10 F_PI-m is the estimated PR fitness when the influence of major resistance mutations on fitness is excluded. The same
nomenclature applies to fitness estimates of RT. Data are expressed as median values with the interquartile range between brackets.

Table 3 Regression analysis to predict viral load and CD4 count

                Model 1                                  Model 2
Factors         log VL (95% CI)               p-value    log VL (95% CI)               p-value
TDR_PI          -0.073 (-0.287 to 0.140)      0.500      -0.080 (-0.290 to 0.129)      0.451
TDR_RTI         -0.009 (-0.157 to 0.138)      0.903      -0.003 (-0.147 to 0.142)      0.971
log10 F_PI      0.271 (0.122 to 0.420)        < 0.001    0.251 (0.104 to 0.398)        0.001
log10 F_RTI     -0.090 (-0.324 to 0.144)      0.449      -0.133 (-0.363 to 0.097)      0.258
log10 F_PI-m    0.314 (0.156 to 0.472)        < 0.001    0.294 (0.137 to 0.450)        < 0.001
log10 F_RTI-m   -0.117 (-0.459 to 0.223)      0.498      -0.180 (-0.516 to 0.155)      0.292

Factors         sqrt CD4 (95% CI)             p-value    sqrt CD4 (95% CI)             p-value
TDR_PI          -0.357 (-2.313 to 1.600)      0.721      -0.501 (-2.417 to 1.416)      0.609
TDR_RTI         0.507 (-0.844 to 1.859)       0.462      0.362 (-0.963 to 1.688)       0.592
log10 F_PI      -1.628 (-2.999 to -0.257)     0.020      -1.540 (-2.891 to -0.189)     0.025
log10 F_RTI     1.465 (-0.683 to 3.613)       0.181      1.583 (-0.527 to 3.692)       0.141
log10 F_PI-m    -1.861 (-3.314 to -0.407)     0.012      -1.779 (-3.213 to -0.346)     0.015
log10 F_RTI-m   1.400 (-1.733 to 4.532)       0.381      1.356 (-1.727 to 4.439)       0.389

Linear regression analyses of the association between genotypic predictors and clinical parameters. Viral load values were log transformed and CD4 counts were
square root transformed to approximate the normal distribution. For each of the genotypic predictors, two models including different sets of potential confounders
were investigated. Model 1 included genotypic predictors for PR and RT, and estimated duration of infection (recent vs unknown duration). Model 2 additionally
included age, gender, risk group and area of origin.



Second, patients with genotypic evidence of TDR were 
excluded from the analysis (Table 4). In both approaches, 
the estimated fitness under the PI selective pressure model 
remained significantly associated with viral load and CD4 
count, while the association of the fitness under the RTI 
selective pressure model remained not significant. These 
results suggest that the correlations found are mainly 
attributable to polymorphic mutations in PR, that also 
confer a selective advantage under treatment selective 
pressure. 

Determining the contribution of individual mutations at 
polymorphic positions 

The fitness models were designed to predict fitness under 
drug selective pressure. In addition to well known muta- 
tions conferring (high-level) phenotypic drug resistance to 
protease inhibitors, mutations contributing to estimated 



fitness under the PR model included also known minor or 
accessory resistance mutations, either non-polymorphic 
or natural polymorphisms, and several other polymorphic 
positions (See Additional file 1: Fitness landscape). Uni- 
variate analysis with correction for multiple testing did 
not identify a single mutation that significantly affected 
viral load or CD4 cell count (data not shown). This sug- 
gested that combinatorial patterns of mutations in pro- 
tease rather than single-mutation effects explained the 
observed associations, arguing in favor of epistatic inter- 
actions between mutations. Naturally occurring poly- 
morphisms in protease (L10I/V, I13V, K20I/M/R, M36I, 
D60E, I62V, L63P, A71T/V, V77I and I93L) have previ- 
ously been associated with therapy experience, and were 
also modeled by the fitness model F_PI. We evaluated the 
correlation of the presence of these polymorphisms with 
viral load or CD4 count. Several of these compensatory 



Table 4 Regression analysis to predict viral load and CD4 count in patients without evidence of TDR

                Model 1                                  Model 2
Factors         log VL (95% CI)               p-value    log VL (95% CI)               p-value
log10 F_PI      0.298 (0.122 to 0.421)        < 0.001    0.269 (0.105 to 0.433)        0.001
log10 F_RTI     -0.090 (-0.324 to 0.144)      0.449      -0.219 (-0.583 to 0.144)      0.237

Factors         sqrt CD4 (95% CI)             p-value    sqrt CD4 (95% CI)             p-value
log10 F_PI      -2.038 (-3.556 to -0.521)     0.009      -1.947 (-3.446 to -0.448)     0.011
log10 F_RTI     1.220 (-2.145 to 4.585)       0.477      1.243 (-2.074 to 4.560)       0.462

Linear regression analyses of the association between genotypic predictors and clinical parameters. Patients that did show genotypic evidence of TDR were excluded
from the analysis. For each of the genotypic predictors, two models including different sets of potential confounders were performed. Model 1 included genotypic
predictors for PR and RT, and estimated duration of infection. Model 2 additionally included age, sex, risk group and area of origin.

mutations were highly polymorphic in the subtype B study 
population (Table 5). The distribution of these therapy- 
associated polymorphisms was similar in recently infected 
patients compared to patients with unknown duration of 
infection (data not shown). Figure 1 shows that a higher 
number of polymorphic mutations in protease was sig- 
nificantly associated (p-value < 0.01) with a higher viral 
load (1a), a lower CD4 count (1b), and a higher estimated 
fitness (log10 F_PI-m, where major PI resistance 
mutations had been reverted) (1c). No significant trend 
was observed for RT polymorphic mutations, in agree- 
ment with the lack of association of estimated fitness 
under the RT drug selective pressure fitness model and 
viral load or CD4 count (data not shown). These results 
further support the hypothesis that polymorphic amino 
acids significantly contributed to the observed associa- 
tions between PR fitness estimated using a drug selective 
pressure fitness model and viremia or CD4 cell count in 
absence of treatment. 
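
The grouping behind Figure 1 can be sketched as a per-patient count of listed polymorphisms, followed by a summary of the clinical parameter within each count group. The mutation list below is the one given in the Figure 1 legend; the function names and the example patients are hypothetical, and the sketch reports only group medians rather than the trend test used in the study.

```python
from collections import defaultdict
from statistics import median

# Polymorphic compensatory mutations listed in the Figure 1 legend.
COMPENSATORY = {"13V", "36I", "60E", "62V", "63P", "71V", "71T", "77I", "93L"}

def count_compensatory(protease_mutations):
    """Count how many of the listed polymorphisms a sequence carries."""
    return len(COMPENSATORY & set(protease_mutations))

def median_by_count(patients):
    """Group (mutations, value) pairs by mutation count; return group medians."""
    groups = defaultdict(list)
    for mutations, value in patients:
        groups[count_compensatory(mutations)].append(value)
    return {count: median(values) for count, values in sorted(groups.items())}

# Hypothetical patients: (protease mutations, log10 viral load).
example = [
    ([], 4.0),
    (["63P"], 4.5),
    (["63P"], 5.0),
    (["63P", "93L", "77I"], 5.25),
]
print(median_by_count(example))
```

The same grouping applies unchanged to square-root transformed CD4 counts or to the estimated fitness values.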

Discussion 

A substantial number of newly diagnosed HIV-1 patients 
are infected with a drug resistant virus, carrying the 
footprints of drug selective pressure [4-6]. Reversion 
of major resistance mutations in absence of therapy, 
both in treatment-experienced patients and in patients 
with transmitted drug resistance [3,5], illustrates that 
these mutations contribute in vivo to a decrease in viral 
fitness in absence of drug pressure. Considering that 
TDR is largely defined by non-polymorphic treatment- 
related mutations, mainly reflecting major drug resis- 
tance mutations [20], TDR has initially been speculated 
to result in lower set-point viral loads and higher CD4 
cell counts, and consequently a slower disease progres- 
sion [11]. While studies have reported that transmis- 
sion of drug-resistant virus was associated with changes 
in initial viral load and CD4 counts in both serocon- 
verters and chronically HIV-infected patients [11,22], 
other studies could not corroborate these findings [23,24]. 
Compensatory mutations often accompany major resis- 
tance mutations, selected to restore impaired intrinsic 



replication capacity [21]. We hypothesized that these 
accessory mutations could also increase viral fitness in 
absence of drug selective pressure or major drug resis- 
tance mutations. If true, TDR could potentially contribute 
to a higher viral load and a lower CD4 cell count in therapy-naive 
patients, which is the opposite of what was previously 
speculated. 

To investigate this hypothesis, we evaluated retrospec- 
tively the association of viral genotype with baseline 
viremia and CD4 cell count in a population of newly diag- 
nosed treatment-naive HIV-1 patients from the SPREAD 
study [4,5]. We quantified the genetic variability with 
respect to its potential contribution to drug selective 
pressure using the most recent WHO surveillance drug 
resistance mutation list [20], a set of known compensatory 
mutations [21], and an in vivo fitness landscape (FL) 
of drug selective pressure. In this study, viral load and 
CD4 cell count did not differ between patients with or 
without TDR, in agreement with earlier reports [23,24]. 
Evidence of TDR primarily consisted of single mutations, 
often 215 revertants [4,5], indicating that major resistance 
mutations had already reverted in the majority of the 
TDR patients, along with their deleterious effects on virus 
replication in absence of drug. FLs modeled the fitness 
contribution of a large number of mutations, including 
polymorphic compensatory mutations. We observed sig- 
nificantly higher viral loads and lower CD4 counts in 
patients infected with a virus deemed fitter by the PI drug 
selective pressure model, even after correcting for pos- 
sible confounders. This association remained significant 
after the exclusion of TDR patients from the analysis or 
of fitness contributions of major drug resistance muta- 
tions. These findings suggest that amino acids at poly- 
morphic positions in PR, more frequently observed in 
patients failing therapy, can increase in vivo fitness in 
absence of therapy and major resistance mutations. The 
correlation between the number of known polymorphic, 
compensatory mutations in PR [21] and changes in viral 
load and CD4 count supported the results of estimated 
fitness. Regression analyses showed relatively small cor- 
relation coefficients, indicating that observed variability 



Table 5 Prevalence of polymorphic compensatory mutations in subtype B protease

Mut    %       n      Mut    %       n
10I    10.02   160    60E    9.2     147
10V    2.76    44     62V    32.75   523
13V    17.09   273    63P    58.42   933
20I    0.13    2      71T    9.58    153
20M    0.69    11     71V    7.2     115
20R    3.44    55     77I    32.19   514
36I    17.72   283    93L    42.45   678

The prevalence of known polymorphic compensatory mutations [21] in the subtype B study population is shown as percentages (%) and absolute count (n).

Figure 1 Association between the number of compensatory 
mutations, estimated fitness and clinical parameters. 

Compensatory mutations, polymorphic in subtype B protease and 
modeled by the fitness model F_PI, are likely candidates to explain the 
observed association of viral fitness estimated under drug selective 
pressure with clinical parameters. For patients with no indications of 
acute infection and TDR (n = 962), the number of compensatory 
mutations (13V, 36I, 60E, 62V, 63P, 71V, 71T, 77I or 93L) in the protease 
sequence is calculated for each patient [21]. The following parameters 
are grouped by mutation count: log10 viral RNA copies/ml (1a), 
square-root transformed CD4 cell counts (1b) and increased 
estimated fitness for protease log10 F_PI-m (1c). The distribution of the 
respective parameter is shown for each group using boxplots. The 
horizontal line (bold) within the boxplot represents the median value, 
with box boundaries indicating the interquartile range. Upper and 
lower ends of striped lines denote the most extreme data point which 
is no more than 1.5 times the IQR from the box. An increased 
mutation number significantly correlated with 1a) increased log10 
viral RNA copies/ml (p-value < 0.01), 1b) decreased square-root 
transformed CD4 cell counts (p-value < 0.01) and 1c) increased 
estimated fitness for protease log10 F_PI-m (p-value < 0.01). A fitted 
line going through the median values (lowess smooth) is shown in 
red. The number of patients for each group is shown above each bin. 



in clinical parameters could only be partially explained, 
and emphasizing a larger role of other host- and virus-related 
factors at play in absence of treatment. Viremia 
showed a stronger relationship with estimated fitness, 
compared to CD4 count, which would be consistent with 
the notion that viremia is a more direct outcome of the 
viral life cycle. Estimated fitness by the RTI drug selective 
pressure model did not significantly correlate with viral 
load and CD4 cell count. However, this fitness landscape 
model and the analyzed sequences included only the part 
of RT that is usually sequenced for drug resistance pur- 
poses (position 1 to 230), and therefore the full potential of 
treatment-related variability in RT remains to be further 
explored. The fact that the PR gene region is less con- 
served not only between different subtypes [25], but also 
within a single subtype [26], may indicate that purifying 
selection acts stronger on RT, suggesting it may be easier 
to find fitness differences in PR compared to RT. 

In vitro studies have observed a wide range of HIV-1 
replication capacity of pol genotypes both from recently 
and chronically infected treatment-naive patients, even 
after excluding viruses with genotypic evidence of drug 
resistance [17,27-30]. Patients infected with a virus show- 
ing lower in vitro pol replication capacity had significantly 
lower baseline HIV-1 RNA levels [27,28] and higher base- 
line CD4 cell counts. This association was independent 
of duration of infection and presence of drug resistance 
[17,28], and was shown to be predictive of disease progression
[28,30]. Although in vitro replication capacity is
quantified differently from our modeled in vivo fitness, and
neither is a direct measurement of in vivo fitness, these
reports further suggest that improved replication capacity
increases the virulence of HIV-1.

Importantly, although our study shows that compen- 
satory mutations can increase in vivo fitness in absence 
of deleterious resistance mutations, we do not claim that 
this higher fitness was caused by transmitted drug resis- 
tance, as many compensatory mutations constitute natu- 
ral genotypic variation in subtype B [12]. However, given 
that these mutations are frequently selected under drug 
selective pressure, our results do indicate the potential 
of transmitted drug resistance to affect the natural
history of HIV-1 at a population level. While still highly
speculative, the observed correlation with CD4 count additionally
suggests a worse clinical outcome in therapy-naive
patients who harbor a fitter virus, whether the
responsible polymorphisms were naturally present [12] or
resulted from TDR. If ART selects, in addition to antiviral
resistance, for more efficient HIV-1 enzymes, and such
substitutions are being transmitted, then these alterations
should be detectable at the epidemic level. With the current
problem of transmitted drug resistance, we predict
that polymorphisms improving virus fitness in both pres- 
ence and absence of drugs may increase in prevalence over 
time. Furthermore, we speculate that TDR may be chang- 
ing the HIV-1 epidemic and potentially pushing the virus 
to become more pathogenic. Additional investigation is 
warranted to confirm this proposed potential of TDR. 

Our study suggests an intricate connection between 
HIV-1 natural diversity, protease plasticity, changes in 
viral fitness and potentially disease progression. While 
individual mutations, resistance related or polymorphic, 
have been studied in terms of enzyme activity, replicative 
capacity and drug susceptibility [13,14,31-33], systematic 
analyses are needed to better understand their combined 
effect through complex epistatic interactions as suggested 
in this manuscript. Similar combinatorial complexity may 
be anticipated for mutations in protease interacting with 
cleavage sites in the gag polyprotein [34], and CTL epi- 
topes involved in viral escape from host selective pressure 
[35]. Empirical studies can further address difficulties in 
the interpretation of relative estimated fitness, which was 
originally conceived and validated in an environment of 
treatment pressure, and in the translation to clinical impli- 
cations. 

A criticism of our findings could be that if compensatory 
mutations increase in vivo fitness in absence of therapy, 
viruses with fitter protease should already have been nat- 
urally selected given that HIV-1 subtype B viruses have 
been circulating for decades in the human population. In 
this respect, it is important to discriminate within-host 
from between-host selective pressure. The within-host 
evolution in absence of drug selective pressure is dom- 
inated by immune selective pressure exerting a strong 
diversifying selection mainly on the immunodominant
envelope [36]. The prevalence of particular polymorphisms
in the infected human population is however
dependent on the epidemic fitness of HIV, which is a very 
complex parameter and governed by between-host effects 
such as transmission efficiency and the number of new 
infections per infected individual. These effects depend 
not only on the level of the viral load, but also on the life 
expectancy of the transmitter (length of the asymptomatic 
phase) which may decrease with increasing viral load. 

One could also argue that, since protease TDR is
decreasing over calendar time [37], we observed the same
trend in compensatory mutations, such that patients
with more compensatory mutations had actually been infected
for a longer time. This potential confounder was taken into
account by performing a separate analysis in which patients
with TDR were excluded. In this additional
analysis, we still detected a significant association
between presence of compensatory mutations in protease,
higher viral load and lower CD4 count.

A major limitation of this study is the cross-sectional
design; consecutive measurements of viral load and CD4
count are needed to better assess the impact of the
reported mutations on disease progression. Indeed, CD4
cell count decline and viral load rise are associated with
longer duration of infection and natural disease progression.
Therefore, it cannot be excluded that natural
disease progression is associated with a rise in prevalence
of these reported mutations, and that what we observe
is merely a consequence of this process. However, first, in
our cohort, neither TDR, estimated fitness values nor
the number of known compensatory mutations differed
between recently and chronically infected patients. Second,
the regression model was extended with a variable
indicating whether a patient was recently infected or not,
which did not change the results. We are also not aware
of any literature indicating that the mutations reported here
increase in prevalence during the natural course
of disease progression; even if they did, this would
not invalidate our hypothesis that these mutations have
a negative impact on disease progression. In that case,
we could interpret the transmission of such mutations
as a head start of the virus in the course of disease
progression.

Finally, with respect to an individual HIV-1 infected
patient, resistance-associated virological failure severely 
limits ART options due to the persistence of acquired 
drug resistance and the existence of within-class cross- 
resistance. We now suggest that the consequences also 
reach beyond a single patient as high rates of resistance 
detected at therapy failure increase the probability of TDR 
[38], transferring genotypic footprints of adaptive evolu- 
tion under drug pressure to newly infected hosts [4]. If 
our hypothesis is true, in the newly infected patient, this 
may not only affect therapy response, but also change the 
natural course of disease progression. The impact of fitter
virus in drug naive patients on therapy effectiveness is cur- 
rently unknown, though better replicating virus can lower 
the genetic barrier to resistance by facilitating resistance 
development and compromising the long term benefit of 
antiretroviral treatment both at the individual and the 
population level [39,40]. 

Conclusions 

It has previously been shown that in absence of treatment 
drug-resistant HIV-1 usually displays a lower replication 
capacity than wild-type virus due to the deleterious 
effect of major resistance mutations. These major resis- 
tance mutations tend to revert quickly after transmis- 
sion to a new host, while compensatory mutations that 
restore enzymatic efficiency rather than confer antiviral 
resistance do not experience this selective pressure for 
reversion. Our results lead us to speculate that antiviral
treatment pressures the virus to optimize its enzymes and
is selecting for a fitter, and possibly also more virulent,
virus at the population level. Increased availability of
antiretroviral treatment and transmission of treatment- 
selected polymorphisms could have clinical implications 
both at the individual and the population level. These find- 
ings could provide additional complexity to the current 
and ongoing controversy on whether HIV-1 virulence is 
changing over time [41]. 

Methods 

Ethics statement 

Ethical requirements are fulfilled according to the pro- 
cedure described in the EC contract. The procedure dif- 
fers among the 32 countries in the network according to 
national legislation (the national reference laboratory and 
corresponding national coordinator for each country are 
listed in Additional file 3). Briefly, for each participating 
hospital or collection center, approval was obtained by the 
institutional medical ethical review committee. Additionally,
written informed consent was obtained from each
patient. In countries where a mandatory surveillance system
was already established, informed consent was not legally
required. All surveillance data were made anonymous
and coded at national level. 

Study population 

The European SPREAD project is a surveillance pro- 
gramme that prospectively collected representative 
data of HIV-1 infected individuals, newly diagnosed 
between September 2002 and December 2005 in 20 
European countries (Austria, Belgium, Cyprus, Czech 
Republic, Germany, Denmark, Spain, Finland, Greece, 
Ireland, Italy, Luxembourg, the Netherlands, Norway, 
Poland, Portugal, Sweden, Slovenia, Slovakia and Serbia) 
and Israel. Previous reports of the SPREAD programme
focused on the analyses of transmitted drug resistance and
HIV-1 subtype distribution in Europe [5,6]. A standard- 
ized sampling strategy was designed by the epidemiology 
expert group of the SPREAD programme to ensure representative
sampling in all countries. Patients were eligible
if they had not received antiretroviral therapy by
the time of sampling and if they were at least 18 years old.
Furthermore, the first available sample obtained within 6
months of HIV-1 diagnosis was used, with a predefined
viral load inclusion threshold above 1000 HIV-1 RNA
copies/ml, since that was the threshold considered necessary
at the time for reliable genotype testing. The few
genotypes associated with a sample with a lower viral load
were therefore considered a source of bias and excluded
from all SPREAD studies. Epidemiological, clinical and
behavioral data were collected using a standardized ques- 
tionnaire. Patients were defined as recently infected if 
they had documented negative or indeterminate HIV-1 
serological results up to 12 months prior to confirmation 
of diagnosis. The remaining newly diagnosed patients 
were classified as those with undefined duration of infec- 
tion. In line with other clinical cohort studies, HIV-RNA 
viral load levels and CD4 cell counts used in this study 
were determined locally using assays validated for clinical 
use. Genotypic analysis was performed decentrally by local laboratories
using either in-house methods or commercially
available genotypic resistance testing kits, and the raw 
nucleotide sequence data were used in the current study. 
All laboratories participated in a continuous blinded 
quality control programme to verify the quality of the 
data. 
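
The inclusion and classification rules described in this section can be summarized in a short sketch. The record fields and function names below are hypothetical and chosen only for illustration; the thresholds are those stated in the text, and this is not code from the SPREAD programme.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # All field names are hypothetical, chosen for this illustration.
    age_years: int
    art_naive: bool
    months_diagnosis_to_sample: float
    viral_load: float  # HIV-1 RNA copies/ml
    # Months between the last negative or indeterminate serology and
    # confirmation of diagnosis; None when no such result is documented.
    months_since_negative_test: Optional[float] = None


def is_eligible(p: Patient) -> bool:
    """Apply the inclusion criteria stated in the text."""
    return (p.art_naive
            and p.age_years >= 18
            and p.months_diagnosis_to_sample <= 6
            and p.viral_load > 1000)


def infection_status(p: Patient) -> str:
    """Classify a patient as recently infected or of undefined duration."""
    m = p.months_since_negative_test
    return "recent" if m is not None and m <= 12 else "undefined duration"
```

A patient without a documented negative or indeterminate serological result falls into the "undefined duration" group by construction.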

HIV-1 fitness landscape 

HIV-1 adaptation to drug selective pressure can be mod- 
eled by observing viral evolution in patients at treat- 
ment failure. When the same mutation is independently 
fixed in multiple patients under selective pressure of the 
same treatment, it may be assumed that this convergent 
evolution indicates an increased fitness of the mutant 
virus under that treatment. Since a synergistic interaction 
between two mutations is expected to lead to a differ- 
ent observed prevalence of one mutation depending on 
the presence of the other, observed associations in preva- 
lence may indicate epistatic fitness interactions between 
these mutations [42]. Hence, a fitness function of HIV-1 
can be learned based on the difference in prevalence of 
mutations in viral sequences from treatment-experienced 
patients compared with untreated patients. Previously, we 
developed an evolutionary framework to reconstruct such 
an in vivo fitness landscape (FL), by quantitatively esti- 
mating the influence of mutations and mutation patterns 
on HIV-1 fitness during treatment, solely as a function 
of the genotypic sequence. This estimated fitness under 
treatment pressure reflects the combined effect of drug 
resistance and replication capacity. Briefly, a fitness function
was constructed following a two-step process (see
[19] and Additional file 1 for a detailed description of the 
methodology). First, conditional dependencies between 
mutations were identified using a probabilistic model that 
efficiently summarized observed correlations between 
mutations. For each mutational pattern that was modeled, 
the fitness function specified a separate fitness contribu- 
tion. Secondly, the selective advantage of each incorpo- 
rated mutation interaction was quantified in an iterative 
step, where the viral evolution under treatment was sim- 
ulated using the fitness landscape. Differences between 
predicted and observed prevalence of the included 
mutation patterns were measured and minimized, and 
fitness values were optimized until convergence was 
reached. 

We previously constructed FLs for specific protease 
inhibitors (PI) [19], or for specific combinations of reverse 
transcriptase inhibitors (RTI) [43] and have shown that 
fitness models can significantly predict therapy response 
in vivo [44]. For the current study, we constructed two 
generic fitness functions, one modeling the selective pres- 
sure on protease (PR) by any protease inhibitor (PI), and 
one modeling the selective pressure on reverse transcrip- 
tase (RT) by any reverse transcriptase inhibitor (RTI) (See 
Additional file 1). To build the fitness functions, data 
were pooled from Portugal, Belgium and the Stanford 
Drug Resistance database only using data independent 
from data within the SPREAD program. A total of 3751 
sequences were from patients treated with one or more 
PIs and 8328 sequences from PI naive patients. A total set
of 1736 sequences were from patients treated with one or 
more RTIs and 3769 sequences from RTI naive patients. 
The models not only included (major) drug resistance 
mutations, but also any polymorphism with a prevalence 
of >1% (PI) or >3% (RTI) in the respective treated population.
The PR fitness landscape FL_PI included 104 mutations
and contained 898 different mutation interactions.
The RT landscape FL_RTI included 112 mutations and
modeled 1172 possible mutation patterns (see Additional
file 1). The two fitness landscapes were scaled to a fitness
of 1 for HIV-1 subtype B reference strain HXB2. For any 
given sequence, the fitness landscape computes a single 
fitness value that represents the relative fitness compared 
to HXB2. 
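
As a rough illustration of how a landscape can map a genotype to a single relative fitness value, the sketch below uses a log-additive form with one term per mutation and one term per interacting pair. The functional form, the mutations chosen and every numeric weight are hypothetical placeholders for illustration; they are not the fitted parameters or the estimation procedure of the landscapes in this study, which are described in [19] and Additional file 1.

```python
import math

# Hypothetical log-fitness contributions (illustrative values only,
# not parameters of the published landscapes).
MAIN_EFFECTS = {"90M": 0.8, "71V": 0.3, "63P": 0.1}
# Hypothetical pairwise (epistatic) contributions, keyed by mutation pair.
PAIR_EFFECTS = {frozenset({"90M", "71V"}): 0.4}


def relative_fitness(mutations):
    """Return fitness relative to the reference (no mutations gives 1.0)."""
    present = set(mutations)
    log_f = sum(MAIN_EFFECTS.get(m, 0.0) for m in present)
    # An interaction term counts only when both partners are present.
    log_f += sum(w for pair, w in PAIR_EFFECTS.items() if pair <= present)
    return math.exp(log_f)


reference = relative_fitness([])            # reference genotype, scaled to 1
single = relative_fitness(["90M"])
double = relative_fitness(["90M", "71V"])   # includes the interaction term
```

With a positive interaction weight, the double mutant scores higher than the product of the two single-mutant values, which is one simple way to encode a synergistic effect.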

Genotypic predictions 

Viral subtype was assessed on the combined PRO-RT 
sequence using the REGA HIV-1 subtyping tool V2 [45]. 
Evidence of transmitted drug resistance was defined as
the presence of at least one surveillance drug-resistance
mutation [20]: 23I, 24I, 30N, 32I, 46IL, 47AV, 48MV, 50VL,
53LY, 54AMLSTV, 73ACST, 76V, 82ACFLMST, 83D,
84ACV, 85V, 88DS or 90M in PR; and 41L, 65R, 67EGN,
69insD, 70ER, 74IV, 75AMST, 77L, 100I, 101EP, 103NS,
106AM, 115F, 116Y, 151M, 179F, 181CIV, 184IV, 188CHL,
190AES, 210W, 215CDEFISVY, 219ENRQ, 225H or 230L
in RT. Major drug resistance mutations were defined
according to the International AIDS Society USA (IAS-USA)
[19]. For PR, these were 30N, 32I, 33F, 43T, 46L, 47V,
48V, 50LV, 54LMV, 58E, 74SP, 76V, 82AFT, 84V, 88DS
and 90M. For RT, these were 41L, 44D, 62V, 65R, 67N,
70R, 74IV, 75I, 77L, 100I, 103N, 106AM, 108I, 115F, 116Y,
118I, 151M, 181C, 188CHL, 190AS, 210W, 215FY, 219EQ
and 225H. Compensatory drug resistance mutations, all
polymorphic variants in wild-type virus, were defined
according to Shafer et al. [21]. For PR, these were 13V, 36I,
60E, 62V, 63P, 71V, 71T, 77I and 93L. For RT, polymorphic
accessory mutations were 98S, 101R, 101Q, 106I, 138A,
179I and 238R. For a given genotype in the study population,
fitness values under PI (F_PI) and RTI (F_RTI) selective
pressure were computed using the models FL_PI and FL_RTI,
respectively. Such an FL assigns a higher fitness to both
major resistance mutations and compensatory mutations,
while in absence of drugs our hypothesis would assign a
negative impact to the major resistance mutation and a
positive impact to the compensatory mutation. Therefore,
the effect of major resistance mutations was excluded in
two ways, which allowed us to capture the contribution of
the compensatory mutations. The first approach was
the exclusion of the subset of patients with indications
of TDR [20] from the analysis. In the second approach, an
additional fitness value (F_PI-m or F_RTI-m) was computed
for sequences that displayed major resistance mutations,
by reverting each major resistance mutation to the corresponding
wild-type amino acid.
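
The two exclusion approaches can be sketched as follows, assuming a genotype is represented as a set of strings that combine a protease position with its mutant amino acid (for example "90M"). The function names and this representation are hypothetical. Only the protease lists are shown; they are transcribed from the text above, with grouped entries such as 46IL expanded into 46I and 46L. In this representation, reverting a mutation to wild type amounts to removing it from the set.

```python
def expand(grouped):
    """Expand entries such as '46IL' into {'46I', '46L'}."""
    out = set()
    for item in grouped.split(","):
        item = item.strip()
        pos = "".join(ch for ch in item if ch.isdigit())
        out.update(pos + aa for aa in item[len(pos):])
    return out


# Protease surveillance drug-resistance mutations, as listed in the text.
PR_SDRM = expand("23I, 24I, 30N, 32I, 46IL, 47AV, 48MV, 50VL, 53LY, "
                 "54AMLSTV, 73ACST, 76V, 82ACFLMST, 83D, 84ACV, 85V, "
                 "88DS, 90M")
# Protease major resistance mutations, as listed in the text.
PR_MAJOR = expand("30N, 32I, 33F, 43T, 46L, 47V, 48V, 50LV, 54LMV, 58E, "
                  "74SP, 76V, 82AFT, 84V, 88DS, 90M")


def has_tdr(mutations):
    """First approach: flag genotypes with at least one SDRM for exclusion."""
    return bool(set(mutations) & PR_SDRM)


def revert_major(mutations):
    """Second approach: drop major mutations (revert them to wild type)."""
    return set(mutations) - PR_MAJOR


genotype = {"63P", "71V", "90M"}  # hypothetical example genotype
```

For the example genotype, the first approach would exclude the patient, whereas the second would keep the sequence and score only the remaining polymorphisms 63P and 71V.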

Statistical analysis 

TDR prevalence values were calculated with a 95% Wilson
score confidence interval based on the binomial distribution.
A log10 transformation of viral load and a square
root transformation of CD4 count were applied in order to
approximate a normal distribution. For continuous
variables, comparisons between means were conducted
by using a t-test or a Mann-Whitney U test. For categorical
variables, comparisons between proportions were
conducted by using the contingency-table chi-square test. Multivariate
linear models were constructed to determine the
ability of the genotypic factors to predict CD4+ T cell
counts (square root transformed) and plasma HIV-1 RNA
levels (log10 transformed). Two multivariable models were
fitted. A first model included, besides genotypic predic- 
tors, a variable indicating evidence of acute infection in 
order to take into account the bias of patients that had 
not reached the set-point viral load [46,47]. In a second 
model, we adjusted for a number of additional potential 
confounders listed in Table 1. All analyses were performed 
using the statistical software R (version 2.12.0). 
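
For reference, the Wilson score interval and the two transformations can be written out as below. This is a minimal Python sketch, not the R code used for the analyses; the function names and the example counts are hypothetical, and z = 1.96 is assumed for a 95% interval.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def transform(viral_load, cd4_count):
    """Return log10 of viral load (copies/ml) and square root of CD4 count."""
    return math.log10(viral_load), math.sqrt(cd4_count)


# Hypothetical example: 10 patients with TDR among 100 genotyped.
low, high = wilson_interval(10, 100)
log_vl, sqrt_cd4 = transform(100000, 400)
```

Unlike the simple normal-approximation interval, the Wilson interval is not symmetric around the observed proportion and stays within 0 and 1, which matters for low prevalence values.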
list of mutations included in the protease or reverse transcriptase fitness
function.